Abstract 

Background: The actin cytoskeleton is essential for many physiological processes of eukaryotic cells. The 
emergence of new actin fibers is initiated by actin nucleators. Whereas most of them are evolutionarily old, the 
cordon-bleu actin nucleator is classified as vertebrate specific. 

Findings: Using sensitive methods for sequence similarity detection, we identified homologs of cordon-bleu not 
only in non-vertebrate chordates but also in arthropods, molluscs, annelids and platyhelminthes. These genes 
contain only a single WH2 domain and therefore more closely resemble the vertebrate cordon-bleu related 1 protein 
than cordon-bleu itself, which contains three WH2 domains. Furthermore, we identified a homolog of the N-terminal, 
ubiquitin-like cobl domain of cordon-bleu in the cnidarian Nematostella vectensis. 

Conclusion: Our results suggest that the ur-form of the cordon-bleu protein family had already evolved with the 
emergence of the bilateria through the combination of existing cobl and WH2 domains. Following a vertebrate-specific 
gene duplication, one copy gained two additional WH2 domains, leading to the actin-nucleating cordon-bleu. The 
function of the ur-form of the cordon-bleu protein family is so far unknown. The identification of a homolog in the 
model organism Drosophila melanogaster could facilitate its experimental characterization. 
Findings 

The actin cytoskeleton is one of the hallmarks of eukaryotic 
cells. It is involved in processes like movement, phagocyt- 
osis and morphogenesis, to name just a few [1]. Although 
its roots can be traced back to prokaryotes and archaea 
[2], eukaryotes have evolved an arsenal of actin associated 
proteins to orchestrate its functions. One of the key 
processes to be regulated is the transformation of the globular 
form (G-actin) to the filamentous form (F-actin). Although 
this process can happen spontaneously, it is kinetically 
unfavorable and therefore inefficient. In the eukaryotic 
cell, this process is supported by different proteins 
which assist in the nucleation, extension and branching 
of actin filaments [3,4]. The first described and arguably 
most prominent is the ARP2/3 complex, which can not 
only initiate the nucleation of new actin filaments but 
also their branching. Its origin can be traced back to the 
last common ancestor of the eukaryotes (LECA) [5]. 
In recent years, further proteins crucial for the 
efficient nucleation of actin filaments have been identified. The 
actin-nucleating FH2 domain is the hallmark of formins, 
which can be found throughout the eukaryotic kingdom 
[6,7]. Only later was an additional actin nucleation domain, 
namely WH2, characterized [8]. Proteins containing 
this domain have been identified throughout the eukary- 
otes and a prokaryotic origin has been suggested [9]. One 
of the WH2 domain type actin nucleators is cordon-bleu, 
a protein involved in the development of the central ner- 
vous system [10], the neural tube [11] and motile cilia 
[12]. In contrast to the other actin nucleators known so far, 
this gene was described as vertebrate specific [11]. This 
might seem unexpected, considering that such an evolutionarily 
young protein is needed for the regulation of a process 
already present in LECA. We therefore set out to unravel 
the evolutionary roots of this gene. 

WH2 domains are short and divergent and thus hard 
to identify with standard sequence analysis approaches. 
As a first step, we therefore focused on a second 
conserved domain in the N-terminus of cordon-bleu, the 
cobl domain. This domain belongs to the ubiquitin fold 
and its structure has been solved (PDB 2DAJ). A 
PSI-Blast search [13] with this sequence identified in the 
first search a significantly similar hit in Branchiostoma 
floridae (Cephalochordata, GI:260809417, E=2xl0"^). In 
the first iteration, hits were found in Saccoglossus 
kowalevskii (Hemichordata, GI:291232830, E=10'^) and 
the sea urchin Strongylocentrotus purpuratus 
(Echinodermata, GI:390346270, E=4xl0'^). Surprisingly, 
proteins of arthropods, e.g. the honey bee Apis mellifera 
(GI:328778502, E=3xl0"^), and of Lophotrochozoa, namely 
the oyster Crassostrea gigas (GI:405954164, E=0.002), the 
sea hare Aplysia californica (GI:524908855, E=2xl0'^) and 
the annelid Capitella teleta (GI:443710193, E=8xlO'^^), 
were also hit. The species most distantly related to vertebrates 
identified in this iteration was Capsaspora owczarzaki, 
belonging to the Ichthyosporea (GI:470321394, E=0.005). 
In the next iteration, the profile showed significant 
similarity to further arthropod proteins including 
Drosophila melanogaster "proximal to raf" (GI:17864372, 
E=10"^). Finally, the first bona fide ubiquitin-like proteins 
(GI:227343644, Trypanosoma brucei, E=0.003) were hit. 
Back searches with e.g. the Capitella teleta sequence 
additionally identified a protein in the platyhelminth 
Clonorchis sinensis (GI:358341913, E=6xl0'^). We thus 
conclude that the cobl domain had evolved no later than the 
emergence of the bilateria. 
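
For orientation, an E-value is the number of hits with at least the observed score that would be expected by chance in a search of the given size. The following minimal sketch uses the standard relation between a normalized bit score S' and the E-value, E = m * n * 2^(-S'), where m is the query length and n the database length. The function names and the numbers in the usage example are illustrative assumptions, not values from this study, and the effective-length corrections applied by real search programs are deliberately omitted.

```python
import math


def evalue_from_bitscore(bit_score, query_len, db_len):
    """Expected number of chance hits scoring at least `bit_score`.

    Implements E = m * n * 2**(-S') without effective-length correction.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


def bitscore_for_evalue(evalue, query_len, db_len):
    """Bit score needed to reach a given E-value in the same search space."""
    return math.log2(query_len * db_len / evalue)


if __name__ == "__main__":
    # Hypothetical search space: 100-residue query, 1e9-residue database.
    needed = bitscore_for_evalue(0.005, 100, 1e9)
    print(f"bit score needed for E=0.005: {needed:.1f}")
```

Because the E-value scales with the size of the search space, the same bit score corresponds to a less significant hit in a larger database.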

To identify possible non-bilaterian members of the 
domain, we performed Hidden Markov Model searches 
[14] with an 85% non-redundant alignment of the hits 
identified with PSI-Blast against the proteomes of a 
placozoan (Trichoplax adhaerens [15]), a sponge (Amphimedon 
queenslandica [16]), cnidarians (Hydra magnipapillata 
[17] and Nematostella vectensis [18]) and a ctenophore 
(Mnemiopsis leidyi [19]). Indeed, a significantly similar 
protein was identified in the proteome of Nematostella 
vectensis (GI:156394513, E=3.8xl0'^). Still, this single 
occurrence leaves the presence of a cobl domain in the 
non-metazoan eukaryote Capsaspora owczarzaki enigmatic. 
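
How the 85% non-redundant alignment was produced is not stated here. One common way to obtain such a set is a greedy filter that drops every aligned sequence that is more than 85% identical to a sequence already kept. The sketch below illustrates that idea under this assumption; the function names and the toy sequences are hypothetical and are not the hits reported above.

```python
def pairwise_identity(a, b):
    """Fraction of identical residues over columns where neither row has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def filter_redundant(alignment, max_identity=0.85):
    """Greedily keep rows that are at most `max_identity` identical to all kept rows.

    `alignment` maps sequence names to equal-length aligned strings;
    rows are visited in insertion order, so earlier rows take precedence.
    """
    kept = {}
    for name, seq in alignment.items():
        if all(pairwise_identity(seq, other) <= max_identity
               for other in kept.values()):
            kept[name] = seq
    return kept


# Toy alignment (hypothetical sequences, for illustration only).
toy = {
    "seq_a": "MKTLLVAGQW",
    "seq_b": "MKTLLVAGQF",  # 9 of 10 columns identical to seq_a
    "seq_c": "MRSLIV-GEW",  # 5 of 9 ungapped columns identical to seq_a
}

if __name__ == "__main__":
    print(list(filter_redundant(toy)))
```

The greedy pass depends on input order, so sorting the sequences (for example by length or annotation quality) before filtering determines which representative of a redundant group is retained.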

But do these proteins indeed belong to the 
cordon-bleu protein family? In vertebrates, a paralog of 
cordon-bleu, cordon-bleu related 1, exists. Whereas the former 
contains three C-terminal WH2 domains, the latter 
contains only one. We therefore set out to identify 
WH2 domains in the cobl domain containing proteins 
using sensitive HMM-to-HMM alignments [20,21]. 
Indeed, an HMM based on the arthropod sequences 
identified in the first PSI-Blast iteration was significantly 
similar to the WH2 HMM from Pfam [22] (PF02205; 
E=4.3xl0'^). Therefore, these arthropod sequences 
contain not only the cobl domain, but also an additional C-terminal 
WH2 domain. With the same approach, we were able to 
identify a single WH2 domain at the C-terminus of the 
oyster protein (E=0.00025). Next, we aligned an HMM 
based on vertebrate cordon-bleu proteins against one 
based on the putative Drosophila homologs. In addition 
to the cobl domain, a significantly (E=0.065) similar 
region was identified between the C-terminus of both 
alignments. Taking the human cordon-bleu as 
reference, this covered positions 1234 to 1247 and thereby at 
least parts of the C-terminal WH2 domain. This 
indicates a single WH2 domain in the C-terminus of the 
Drosophila "proximal to raf" proteins. Despite the 
sensitivity of these approaches, no WH2 domain was 
predicted in the Nematostella vectensis protein containing 
a cobl domain. In contrast, InterPro identified a WH1 
domain (also found in Wiskott-Aldrich syndrome proteins) 
at positions 75-186. The cobl domain containing 
Capsaspora owczarzaki protein also lacks a WH2 
domain, but contains a WW, a PTB and a CRIB domain. 
Most interestingly, the latter is also found in WASp 
proteins. 

Taken together, we have shown that the evolutionary 
roots of the cordon-bleu (related) protein family lie 
before the first emergence of vertebrates. Its absence in 
some major metazoan lineages like the Nematoda can 
be explained by lineage specific losses. Thus, our results 
suggest the following evolutionary history of cordon- 
bleu proteins. First, the N-terminal cobl domain evolved 
from an existing ubiquitin fold. With the emergence of 
the bilateria, this domain was combined with a single 
WH2 domain. Finally, with the emergence of the 
vertebrates, a gene duplication gave rise to the cordon-bleu 
related 1 and the cordon-bleu genes. In the latter, two 
additional WH2 domains were acquired. We were not 
able to predict whether these new domains arose via an 
internal duplication or were acquired from another pro- 
tein as the sequences were too short to calculate a reliable 
phylogeny. 
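
The domain compositions described in the text can be collected in a small lookup table, which makes the grouping by WH2 count explicit. The sketch below is only a bookkeeping aid: the labels, the `classify` function and its category names are illustrative assumptions, domain order is not modeled, and the presence of a cobl domain in cordon-bleu related 1 is inferred from its description as a paralog rather than stated explicitly.

```python
from collections import Counter

# Domain counts as described in the text; order within the protein is ignored.
ARCHITECTURES = {
    "vertebrate cordon-bleu": Counter({"cobl": 1, "WH2": 3}),
    # cobl in cordon-bleu related 1 is an assumption (paralog of cordon-bleu).
    "vertebrate cordon-bleu related 1": Counter({"cobl": 1, "WH2": 1}),
    'Drosophila "proximal to raf"': Counter({"cobl": 1, "WH2": 1}),
    "Crassostrea gigas protein": Counter({"cobl": 1, "WH2": 1}),
    "Nematostella vectensis protein": Counter({"cobl": 1, "WH1": 1}),
    "Capsaspora owczarzaki protein": Counter(
        {"cobl": 1, "WW": 1, "PTB": 1, "CRIB": 1}
    ),
}


def classify(domains):
    """Group a protein by its number of WH2 domains (hypothetical categories)."""
    if domains["cobl"] == 0:
        return "no cobl domain"
    wh2 = domains["WH2"]  # Counter returns 0 for missing keys
    if wh2 == 0:
        return "cobl without WH2"
    if wh2 == 1:
        return "single-WH2 (ur-form-like)"
    return "multi-WH2 (cordon-bleu-like)"


if __name__ == "__main__":
    for name, domains in ARCHITECTURES.items():
        print(f"{name}: {classify(domains)}")
```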

To act as an actin nucleator, a protein has to bring 
several monomeric actin molecules into close proximity. 
In cordon-bleu, this is achieved by the binding of actin to 
each of the three WH2 domains. In contrast, cordon-bleu 
related 1 as well as the ur-form pre-dating the duplication 
contain only a single WH2 domain. Therefore, it is 
unlikely that these molecules have actin nucleation 
capabilities. Thus, this function evolved following the gene 
duplication. The identification of a candidate ortholog in 
the model organism Drosophila melanogaster could enable 
the functional characterization of the ur-form of the 
cordon-bleu protein family. Subsequently, the adaptations 
leading to the vertebrate-specific function of cordon-bleu 
could be traced. Thus, the cordon-bleu protein family 
could become a test case to study functional changes 
following gene duplications. 